Acoustic calibration arrangement for a voice switched speakerphone.



An acoustic calibration circuit (113) in a voice switched adaptive speakerphone accurately determines the
type of acoustic environment in which the speakerphone is employed. The calibration circuit measures the
acoustics of the room by emitting a tone burst through a loudspeaker (112) associated with the speakerphone
and measuring the returned time-domain acoustic response with a microphone (111) also associated with the
speakerphone. Obtained from this response and processed by a computer (110) in the speakerphone are the
maximum amplitude of the returned signal and the duration of the echoes. The amplitude of the returned signal
determines what level of transmit speech will be required to break in on receive speech. The greater the
acoustic return, the higher that threshold must be to protect against self-switching. The duration of the
echoes determines how quickly speech energy injected into the room will dissipate, which, in turn, controls how
fast the speakerphone can switch from a receive to a transmit state. If the room acoustics are harsh, the
speakerphone adapts by keeping its switching response comparable with that of a typical analog speakerphone.
If acoustics are favorable, however, the speakerphone speeds up the switching time and lowers both the break-in
thresholds and the total amount of switched loss. This decrease in switched loss, while the speakerphone is in a
good acoustic environment, provides the user with noticeably more transparent performance.

Acoustic Calibration Arrangement for a Voice Switched Speakerphone

Background of the Invention 



1. Technical Field

This invention relates to audio systems and, more particularly, to voice switching circuits which connect 
to an audio line for providing two-way voice switched communications. 



2. Description of the Prior Art

Analog speakerphones have been the primary hands-free means of communicating during a
telephone conversation for a great number of years. This convenient service has been obtained at the price
of some limitations, however. These speakerphones usually require careful and expensive calibration in order
to operate in an acceptable manner. They are also designed to operate in a worst-case acoustic environment,
thereby sacrificing the improved performance that is possible in a better acoustic environment.

The operation of conventional analog speakerphones is well known and is described in an article by A.
Busala, "Fundamental Considerations in the Design of a Voice-Switched Speakerphone," Bell System
Technical Journal, Vol. 39, No. 2, March 1960, pp. 265-294. Analog speakerphones generally use a
switched-loss technique through which the energy of the voice signals in both a transmit and a receive
direction is sensed and a switching decision made based upon that information. The voice signal having
the highest energy level in a first direction will be given a clear talking path and the voice signal in the
opposite direction will be attenuated by having loss switched into its talking path. If voice signals are not
present in either the transmit direction or the receive direction, the speakerphone goes to an "at rest" mode
which provides the clear talking path to voice signals in a receive direction, favoring speech from a distant
speaker. In some modern analog speakerphones, if voice signals are not present in either the transmit
direction or the receive direction, the speakerphone goes to an idle mode where the loss in each direction
is set to a mid-range level to allow the direction wherein voice signals first appear to quickly obtain the clear
talking path.

Most high-end analog speakerphones also have a noise-guard circuit to adjust the switching levels
according to the level of background noise present. Switching speed is limited by a worst-case time
constant that assures that any speech energy in the room has time to dissipate. This limitation is necessary
to prevent "self switching", a condition where room echoes are falsely detected as near-end speech. A
disadvantage of this type of speakerphone is that no allowance is made for a room that has good acoustics,
i.e., low echo energy return and short duration echoes.

With the advent of echo cancellers, echo cancelling speakerphones have become available in the art.
These speakerphones are complex and expensive devices that, like the analog speakerphones, attempt to
maintain a balance in an inherently unstable environment. The echo cancelling speakerphones available in
the art require a user to initiate a white noise start up sequence upon entering each telephone call. This
white noise burst into the environment is used by the echo canceller to develop a frequency and phase
response for the system loop. From this information, a sampled-time impulse response for the loop is
developed, which includes both the room acoustics and the hybrid response. This sampled-time impulse
response is a series of signed coefficients that when convolved with the received signal will cancel this
signal and yield only the desired transmit signal. In operation, the echo cancellers first determine this
impulse response, then invert it by changing the sign of each coefficient. When the received signal is
passed through the inverted impulse response, the received signal due to the echoes is cancelled.
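The cancellation step described above can be illustrated with a minimal Python sketch, assuming the loop's sampled-time impulse response is already known. The function names, the three-tap response and the sample values are illustrative assumptions and are not taken from any particular echo canceller.

```python
def convolve(signal, impulse_response):
    """Return the full discrete convolution of two sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def cancel_echo(mic, received, impulse_response):
    """Add the received signal, filtered by the sign-inverted response, to the mic signal."""
    inverted = [-h for h in impulse_response]
    anti_echo = convolve(received, inverted)
    return [m + anti_echo[n] for n, m in enumerate(mic)]


# Hypothetical loop: the echo arrives one sample late at half amplitude,
# then again one sample later at a quarter amplitude.
loop_response = [0.0, 0.5, 0.25]
received = [1.0, 0.0, -1.0, 0.0]
near_end = [0.1, 0.2, 0.3, 0.4]

# The microphone picks up near-end speech plus the echo of the received signal.
echo = convolve(received, loop_response)[:len(received)]
mic = [s + e for s, e in zip(near_end, echo)]
cleaned = cancel_echo(mic, received, loop_response)
```

With an exact impulse response, `cleaned` matches `near_end` to within rounding. In practice the response is only an estimate, which is why the start-up sequence matters.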

A difficulty associated with building suitable echo cancelling speakerphones is that the acoustic
environment in which these speakerphones must operate is highly variable and the hybrid environment will
change with each call. In an ideal acoustic and hybrid environment, the echo cancellers ought to be able to
characterize the system loop based solely on the user's speech. Since such an environment cannot
generally be assured, the user is required to initiate the disruptive white noise burst at the start of each call
to initialize the echo cancellers.

Although each of the above speakerphone systems provides reasonable two-way hands-free communications
for a user, it is desirable to have an efficient and cost-effective speakerphone without the disadvantages
and limitations associated with the operation of these systems.

Summary of the Invention

In accordance with the present invention, an acoustic calibration circuit is employed in an adaptive
speakerphone for accurately determining the type of acoustic environment in which the speakerphone is
employed. The calibration circuit measures the acoustics of the room by emitting a tone burst through a
loudspeaker associated with the speakerphone and measuring the returned time-domain acoustic response
with a microphone also associated with the speakerphone.

Obtained from the time-domain acoustic response and processed by a computer in the speakerphone
are the maximum amplitude of the returned signal and the duration of the echoes. The amplitude of the
returned signal determines what level of transmit speech will be required to break in on receive speech.
The greater the acoustic return, the higher that threshold must be to protect against self-switching. The
duration of the echoes determines how quickly speech energy injected into the room will dissipate, which, in
turn, controls how fast the speakerphone can switch from a receive to a transmit state. If the room acoustics
are harsh, the speakerphone adapts by keeping its switching response comparable with that of a typical
analog speakerphone. If acoustics are favorable, however, it speeds up the switching time and lowers both the
break-in thresholds and the total amount of inserted switched loss.

Thus in operation, if the speakerphone is located in a good acoustic environment, the total amount of
switched loss required can be significantly less than worst case. Also, the decrease in switched loss while in
a good acoustic environment provides the user with noticeably more transparent performance.

Brief Description of the Drawing 

FIG. 1 is a block representation of the major functional components of a computer controlled adaptive
speakerphone operative in accordance with the principles of the invention;

FIG. 2 is a partial schematic of the speakerphone including a calibration circuit, an amplifier for
remotely provided speech signals, a microphone and an associated amplifier and multiplexers employed in
this invention;

FIG. 3 is a partial schematic of the speakerphone including mute controls and high pass filters
employed in this invention;

FIG. 4 is a schematic of a programmable attenuator and a low pass filter employed in a transmit
section of this invention;

FIG. 5 is a schematic of a programmable attenuator and a low pass filter employed in a receive
section of this invention;

FIG. 6 depicts a general speakerphone circuit and two types of coupling that most affect its operation;

FIG. 7 is a state diagram depicting the three possible states of the speakerphone of FIG. 1;

FIG. 8 depicts a flow chart illustrating the operation of the speakerphone of FIG. 1 in determining
whether to remain in an idle state or move from the idle state to a transmit or a receive state;

FIG. 9 depicts a flow chart illustrating the operation of the speakerphone of FIG. 1 in determining
whether to remain in the transmit state or move from the transmit state to the receive state or idle state;

FIG. 10 depicts a flow chart illustrating the operation of the speakerphone of FIG. 1 in determining
whether to remain in the receive state or move from the receive state to the transmit state or idle state;

FIG. 11 shows illustrative waveforms which depict impulse and composite characterizations of an
acoustic environment performed by the speakerphone of FIG. 1;

FIG. 12 is a block representation of the functional components of a speakerphone operable in
providing echo suppression loss insertion;

FIG. 13 depicts a flow chart illustrating the operation of the speakerphone of FIG. 12 in the
application of echo suppression loss insertion; and

FIG. 14 shows waveforms illustrating the application of echo suppression loss insertion.

Detailed Description 

FIG. 1 is a functional block representation of a computer controlled adaptive speakerphone 100
operative in accordance with the principles of the invention. As shown, the speakerphone generally
comprises a transmit section 200, a receive section 300, and a computer 110. A microcomputer commercially
available from Intel Corporation as Part No. 8051 may be used for computer 110 with the proper
programming. A microphone 111 couples audio signals to the speakerphone and a speaker 112 receives
output audio signals from the speakerphone.

By way of illustration, an audio signal provided by a person speaking into the
microphone 111 is coupled into the transmit section 200 to a multiplexer 210. In addition to being able to
select the microphone speech signal as an input, the multiplexer 210 may also select calibration tones as
its input. These calibration tones are provided by a calibration circuit 113 and are used, in this instance, for
calibration of the hardware circuitry in the transmit section 200.

Connected to the multiplexer 210 is a mute control 211 which mutes the transmit path in response to a
control signal from the computer 110. A high pass filter 212 connects to the mute control 211 to remove the
room and low frequency background noise in the speech signal. The output of the high pass filter 212 is
coupled both to a programmable attenuator 213 and to an envelope detector 214. In response to a control
signal from the computer 110, the programmable attenuator 213 inserts loss in the speech signal in three
and one half dB steps up to a total of sixteen steps, providing 56 dB of total loss. This signal from the
programmable attenuator 213 is coupled to a low pass filter 215 which removes any spikes that might have
been generated by the switching occurring in the attenuator 213. This filter also provides additional signal
shaping to the signal before the signal is transmitted by the speakerphone over audio line 101 to a hybrid
(not shown). After passing through the envelope detector 214, the speech signal from the filter 212 is
coupled to a logarithmic amplifier 216, which expands the dynamic range of the speakerphone to
approximately 60 dB for following the envelope of the speech signal.

The receive section 300 contains speech processing circuitry that is functionally the same as that found
in the transmit section 200. A speech signal received over an input audio line 102 from the hybrid is
coupled into the receive section 300 to the multiplexer 310. Like the multiplexer 210, the multiplexer 310
may also select calibration tones for its input, which are provided by the calibration circuit 113. Connected
to the multiplexer 310 is a mute control 311 which mutes the receive path in response to a control signal
from the computer 110. A high pass filter 312 is connected to the mute control 311 to remove the low
frequency background noise from the speech signal.

The output of the high pass filter 312 is coupled both to an envelope detector 314 and to a
programmable attenuator 313. The envelope detector 314 obtains the signal envelope for the speech signal
which is then coupled to a logarithmic amplifier 316. This amplifier expands the dynamic range of the
speakerphone to approximately 60 dB for following the envelope of the receive speech signal. The
programmable attenuator 313, responsive to a control signal from the computer 110, inserts loss in the
speech signal in three and one half dB steps in sixteen steps, for 56 dB of loss. This signal from the
programmable attenuator 313 is coupled to a low pass filter 315 which removes any spikes that might have
been generated by the switching occurring in the attenuator 313. This filter also provides additional signal
shaping to the signal before the signal is coupled to the loudspeaker 112 via an amplifier 114.

The signals from both the logarithmic amplifier 216 and the logarithmic amplifier 316 are multiplexed
into an eight-bit analog-to-digital converter 115 by a multiplexer 117. The converter 115 presents the
computer 110 with digital information about the signal levels every 750 microseconds.

The computer 110 measures the energy of the incoming signals and develops information about the 
signal and noise levels. Both a transmit signal average and a receive signal average are developed by 
averaging samples of each signal according to the following equation: 



$$
y_t =
\begin{cases}
y_{t-1} + \dfrac{|s|_t - y_{t-1}}{4} & \text{if } |s|_t \ge y_{t-1} \\[1ex]
\ldots & \text{if } |s|_t < y_{t-1}
\end{cases}
$$

where

Sampling rate = 1333 per second
|s|_t = new sample
y_{t-1} = old average
y_t = new average

This averaging technique tends to pick out peaks in the signal applied. Since speech tends to have
many peaks rather than a constant level, this average favors detecting speech.

Both a transmit noise average and a receive noise average are also developed. The transmit noise
average determines the noise level of the operating environment of the speakerphone. The receive noise
average measures the noise level on the line from the far-end party. The transmit noise average and the
receive noise average are both developed by measuring the lowest level seen by the converter 115. Since
background noise is generally constant, the lowest samples provide a reasonable estimate of the noise
level. The transmit and receive noise averages are developed using the following equation:



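$$
y_t =
\begin{cases}
y_{t-1} + \dfrac{|s|_t - y_{t-1}}{4096} & \text{if } |s|_t \ge y_{t-1} \\[1ex]
y_{t-1} + \dfrac{|s|_t - y_{t-1}}{\ldots} & \text{if } |s|_t < y_{t-1}
\end{cases}
$$

where

Sampling rate = 1333 per second
|s|_t = new sample
y_{t-1} = old average
y_t = new average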

This equation strongly favors minimum values of the envelope of the applied signal, yet still provides a
path for the resulting average to rise when faced with a noisier environment.
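The two averages can be sketched in Python as asymmetric running averages. This is only an illustration under stated assumptions. The divisor 4 on the rising branch of the signal average and the divisor 4096 on the rising branch of the noise average are read from the equations. The falling-branch divisors (32 and 4) are placeholder values chosen for illustration, and the function and parameter names are hypothetical.

```python
def update_signal_average(old_avg, sample, rise_div=4, fall_div=32):
    """Peak-favoring average: rises quickly, decays slowly.

    fall_div is an assumed placeholder, not a value from the text.
    """
    magnitude = abs(sample)
    divisor = rise_div if magnitude >= old_avg else fall_div
    return old_avg + (magnitude - old_avg) / divisor


def update_noise_average(old_avg, sample, rise_div=4096, fall_div=4):
    """Minimum-favoring average: rises very slowly, falls quickly.

    fall_div is an assumed placeholder, not a value from the text.
    """
    magnitude = abs(sample)
    divisor = rise_div if magnitude >= old_avg else fall_div
    return old_avg + (magnitude - old_avg) / divisor


# Illustrative envelope: a quiet floor of 2 with one speech burst of 100.
envelope = [2] * 50 + [100] * 20 + [2] * 50
signal_avg = noise_avg = 2.0
peak_signal_avg = 0.0
for s in envelope:
    signal_avg = update_signal_average(signal_avg, s)
    noise_avg = update_noise_average(noise_avg, s)
    peak_signal_avg = max(peak_signal_avg, signal_avg)
```

During the burst the signal average climbs almost to the burst level while the noise average barely moves. Once the burst ends, the noise average returns to the floor within a few samples and the signal average decays more gradually.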

Two other signal levels are developed to keep track of the loop gain, which affects the switching
response and singing margin of the speakerphone. These signal levels are the speech level that is present
after being attenuated by the transmit attenuator 213 and the speech level that is present after being
attenuated by the receive attenuator 313. In the speakerphone, these two levels are inherently known due to
the fact that the computer 110 directly controls the loss in the attenuators 213 and 313 in discrete amounts,
3.5 dB steps with a maximum loss of 56 dB in each attenuator. All of these levels are developed to provide
the computer 110 with accurate and updated information about what the current state of the speakerphone
should be.

As in all speakerphones, the adaptive speakerphone needs to use thresholds to determine its state.
Unlike its analog predecessors, however, those thresholds need not be constant. The computer 110 has the
ability to recalibrate itself to counteract variation and aging of hardware circuitry in the speakerphone. This
is achieved by passing a first and a second computer-generated test tone through the transmit path and the
receive path of the hardware circuitry and measuring both responses.

These test tones are generated at a zero dB level and a minus 20 dB level. The difference measured
between the zero dB level tone and the minus 20 dB level tone that passes through the speakerphone
circuitry is used as a base line for setting up the thresholds in the speakerphone. First, by way of example,
the zero dB level tone is applied to the transmit path via multiplexer 210 and that response measured by
the computer 110. Then the minus 20 dB tone is similarly applied to the transmit path via multiplexer 210
and its response measured by the computer. The difference between the two responses is used by the
computer as a basic constant of proportionality that represents "20 dB" of difference in the transmit path
circuitry. This same measurement is similarly performed on the receive path circuitry by applying the two
test tones via multiplexer 310 to the receive path. Thus, a constant of proportionality is also obtained for this
path. The number measured for the receive path may be different from the number measured for the
transmit path due to hardware component variations. The computer simply stores the respective number for
the appropriate path with an assigned value of minus 20 dB to each number. Once the computer has
determined the number representing minus 20 dB for each path, it is then able to set the required dB
threshold levels in each path that are proportionally scaled to that path's number. Also, because of the
relative scaling, the common thresholds that are set up in each path always will be essentially equal even
though the values of corresponding circuit components in the paths may differ considerably.
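The proportional scaling can be sketched as follows. Because the logarithmic amplifiers make the converter reading roughly linear in dB, the 20 dB difference between the two tone readings gives a counts-per-dB factor for each path. The function names and the converter readings below are hypothetical, chosen only to show the arithmetic.

```python
def counts_per_db(reading_0db, reading_minus_20db):
    """Converter counts per dB for one path, from the two test-tone readings."""
    return (reading_0db - reading_minus_20db) / 20.0


def threshold_in_counts(threshold_db, scale):
    """Convert a threshold in dB to converter counts for a particular path."""
    return threshold_db * scale


# Hypothetical readings from the converter for each path.
transmit_scale = counts_per_db(reading_0db=200, reading_minus_20db=120)
receive_scale = counts_per_db(reading_0db=190, reading_minus_20db=100)

# The same 6 dB threshold maps to a different count in each path.
transmit_threshold = threshold_in_counts(6.0, transmit_scale)
receive_threshold = threshold_in_counts(6.0, receive_scale)
```

With these made-up readings, the transmit path scales at 4.0 counts per dB and the receive path at 4.5, so a common 6 dB threshold becomes 24 counts in one path and 27 in the other.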

As part of the calibration process, the speakerphone also measures the acoustics of the room in which
it operates. Through use of the calibration circuit 113, the speakerphone generates a series of eight
millisecond tone bursts throughout the audible frequency range of interest and uses these in determining the
time-domain acoustic response of the room. Each tone burst is sent from the calibration circuit 113 through the
receive section 300 and out the loudspeaker 112. The integrated response, which is reflective of the echoes
in the room from each tone burst, is picked up by the microphone 111 and coupled via the transmit section
200 to the computer 110 where it is stored as a composite response pattern, shown in FIG. 11 and
described in greater detail later herein. This response is characterized by two important factors: the
maximum amplitude of the returned signal and the duration of the echoes. The amplitude of the returned
signal determines what level of transmit speech will be required to break in on receive speech. The greater
the acoustic return, the higher that threshold must be to protect against self-switching. The duration of the
echoes determines how quickly speech energy injected into the room will dissipate, which controls how fast
the speakerphone can switch from a receive to a transmit state. If the room acoustics are harsh, therefore,
the speakerphone adapts by keeping switching response on a par with that of a typical analog device. But
when acoustics are favorable, it speeds up the switching time and lowers break-in thresholds to provide a
noticeable improvement in performance.
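The two factors can be extracted from a stored response pattern along the following lines. This is a rough sketch, not the procedure of FIG. 11. The `floor` level below which the echo is treated as dissipated, the function name and the sample envelopes are all assumptions made for illustration. The 0.75 ms sample period follows from the converter interval of 750 microseconds.

```python
SAMPLE_PERIOD_MS = 0.75  # one converter reading every 750 microseconds


def characterize_response(envelope, floor):
    """Return (peak amplitude, echo duration in ms) for a returned envelope.

    `floor` is an assumed level below which the echo counts as dissipated.
    """
    peak = max(envelope)
    last_above = -1
    for index, level in enumerate(envelope):
        if level > floor:
            last_above = index
    duration_ms = (last_above + 1) * SAMPLE_PERIOD_MS
    return peak, duration_ms


# Made-up envelopes: a harsh room returns more energy and rings longer.
harsh_room = [0, 90, 80, 70, 60, 50, 40, 30, 20, 10, 5, 0]
favorable_room = [0, 40, 20, 5, 0, 0, 0, 0, 0, 0, 0, 0]

harsh_peak, harsh_ms = characterize_response(harsh_room, floor=8)
good_peak, good_ms = characterize_response(favorable_room, floor=8)
```

A larger peak would call for a higher break-in threshold, and a longer duration for a slower switch from receive to transmit.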

The concept of self-calibration is also applied to the speakerphone's interface to a hybrid. During a
conversation, the computer measures the degree of hybrid reflection that it sees. This hybrid reflection
provides a measure of both the hybrid and far-end acoustic return. Its average value is determined using
the following equation:



[Equation for the hybrid average H_t; not legible in the source text.]

where

Sampling rate = 1333 per second
R_t = receive signal average
T_t = transmit signal average
H_{t-1} = old hybrid average
H_t = new hybrid average

This equation develops the hybrid average value by subtracting a transmit signal from a receive signal
and then averaging these signals in a manner that favors the maximum difference between them. The
receive signal is that signal provided to the speakerphone by the hybrid on the receive line and the transmit
signal is that signal provided to the hybrid by the speakerphone on the transmit line. By developing an
estimate of the hybrid average, the amount of switched loss required in the speakerphone to maintain
stability may be raised or lowered. By lowering the amount of switched loss, speakerphone switching
operation becomes more transparent and can even approach full-duplex for fully digital connections.

The estimate of the hybrid average is also used to determine the switching threshold level of the
speakerphone in switching from the transmit state to the receive state (receive break-in). Since the estimate
of the hybrid average is used to develop an expected level of receive speech due to reflection, additional
receive speech due to the far-end talker may be accurately determined and the state of the speakerphone
switched accordingly.

To obtain an accurate representation of the line conditions, hybrid averaging is performed only while the
speakerphone is in the transmit state. This ensures that receive speech on the receive line during a quiet
transmit interval cannot be mistaken for a high level of hybrid return. This averaging therefore prevents
receive speech that is not great enough to cause the speakerphone to go into the receive state from
distorting the estimated hybrid average.

Another boundary condition employed in developing this hybrid average is a limitation on the
acceptable rate of change of transmit speech. If transmit speech ramps up quickly, then the possibility of
sampling errors increases. To avoid this potential source of errors, the hybrid average is only developed
during relatively flat intervals of transmit speech (the exact slope is implementation-dependent).

To ensure stable operation with an adaptive speakerphone in use at both the near-end and the far-end
by both parties, the amount that the hybrid average may improve during any given transmit interval is also
limited. In the adaptive speakerphone 100, for example, the hybrid average is allowed to improve no more
than 5 dB during each transmit state. In order for the hybrid average to improve further, a transition to
receive and then back to transmit must be made. This ensures that the far-end speakerphone has also had
an opportunity to go into the transmit state and has similarly adapted. Thus, each speakerphone is able to
reduce its inserted loss down to a point of balance in a monotonic fashion. Limiting the amount of change in
the hybrid average during a transmit interval also allows this speakerphone to be operable with other
adaptive speakerphones such as echo-canceling speakerphones that present a varying amount of far-end
echo as they adapt.
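The 5 dB limit per transmit interval can be sketched as a simple clamp. The class below is a hypothetical illustration. It assumes the hybrid estimate is held in dB with lower values meaning less reflection, and it replaces the averaging step with a direct assignment to keep the example short. Only the 5 dB figure comes from the text.

```python
MAX_IMPROVEMENT_DB = 5.0  # limit per transmit interval, from the text


class HybridEstimate:
    """Tracks a hybrid return estimate in dB (lower means less reflection)."""

    def __init__(self, initial_db):
        self.value_db = initial_db
        self.interval_start_db = initial_db

    def enter_transmit(self):
        """Start a new transmit interval and reset the improvement budget."""
        self.interval_start_db = self.value_db

    def update(self, measured_db):
        """Accept a new measurement, improving at most 5 dB per interval."""
        best_allowed = self.interval_start_db - MAX_IMPROVEMENT_DB
        self.value_db = max(measured_db, best_allowed)
        return self.value_db


estimate = HybridEstimate(initial_db=-6.0)
estimate.enter_transmit()
first = estimate.update(-20.0)   # limited to 5 dB better than -6.0
estimate.enter_transmit()        # after a receive interval and a return to transmit
second = estimate.update(-20.0)  # limited to 5 dB better than the previous value
```

Each return to the transmit state unlocks at most another 5 dB of improvement, so two speakerphones adapting against each other step toward balance gradually.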

For ease of operation and for configuring the speakerphone, a user interface 120 through which the
user has control over speakerphone functions is provided internal to the speakerphone 100. This interface
includes such speakerphone functions as ON/OFF, MUTE and VOLUME UP/DOWN. The user interface also
includes a button or other signaling device for initiating the recalibration process. Should the user relocate
his or her speakerphone, pressing this button will perform an acoustic calibration to the new environment. In
addition, the recalibration process checks the operational readiness of and recalibrates the internal hardware
circuitry, and resets the volume level of the speakerphone to the nominal position.

Referring now to FIGS. 2 and 3, there is shown a partial schematic of the speakerphone 100 including
the multiplexers 210 and 310, mute controls 211 and 311, the calibration circuit 113, the microphone 111
and its associated amplifier 117, amplifier 135 for the remotely provided speech signals, and high pass
filters 212 and 312.

Shown in greater detail is the microphone 111 which, in this circuit arrangement, is an electret
microphone for greater sensitivity. This microphone is AC coupled via a capacitor 116 to an amplifier 117
which includes resistors 118 and 119 for setting the transmit signal gain from the microphone 111. From the
amplifier 117, the speech signal is sent to the multiplexer 210 in the transmit section 200.

Also shown in greater detail is the calibration circuit 113 which receives a two-bit input from the
computer 110 on lines designated as CALBIT UP and CALBIT DOWN. This two-bit input provides the tone
burst signal used in the hardware circuitry and acoustic calibration processes. Three states from the two-bit
input are defined and available: LOW reflects a zero level signal where the input signals on both CALBIT UP
and CALBIT DOWN are one; HIGH reflects a condition where the input signals to both CALBIT UP and
CALBIT DOWN are zero; and MIDDLE reflects a condition where, for example, the CALBIT UP signal is one
and the CALBIT DOWN signal is zero. By alternately presenting and removing the respective input signals
to both CALBIT UP and CALBIT DOWN in a desired sequence, a tone burst is generated which starts from
ground level, goes up to some given positive voltage level, then down to some given negative voltage level,
then returns back to ground level.

The CALBIT UP and CALBIT DOWN signals are respectively provided as input signals to an amplifier
121 via a first series connection, comprising diode 122 and resistor 123, and a second series connection,
comprising diode 124 and resistor 125. The amplifier 121 and associated circuitry, capacitor 127 and
resistor 128, are used to generate the desired output level reflective of the summation of the two input
signals. A resistor divider, comprising resistors 156 and 157, provides an offset voltage to the non-inverting
input of amplifier 121. A second resistor divider, comprising resistors 129 and 130, provides the 20 dB
reduction of the signal level from amplifier 121. This reduction is used for the comparison measurement when
the speakerphone performs the electrical calibration process. Thus the signal on line 131 is 20 dB less than the
signal on line 132. Both of these signals are coupled to the multiplexers 210 and 310.

A receive audio input level conversion circuit, comprising amplifier 135, resistors 136, 137 and 138, and
also capacitor 139, is connected to audio input line 102 for terminating this line in 600 ohms. This signal is
coupled from the amplifier 135 to the multiplexer 310 along with the tone signal from amplifier 121 for
further processing.

The output of the multiplexer 210 is provided over line 138 to a mute control 211 which mutes the
transmit path in response to a control signal from the computer 110 over line 140. Similarly, the output of
the multiplexer 310 is provided over line 139 to a mute control 311 which mutes the receive path in
response to a control signal from the computer 110 over line 141. Respectively connected to the mute
controls 211 and 311 are high pass filters 212 and 312. These high pass filters are essentially identical and
are designed to remove the low frequency background noise in the speech signal. Filter 212 comprises a
follower amplifier 217, and associated circuitry comprising capacitors 218 and 219, and resistors 220 and
221. The output of filter 212 is coupled over line 142 to the programmable attenuator 213 shown in FIG. 4.
Filter 312 comprises a follower amplifier 317, and associated circuitry comprising capacitors 318 and
319, and resistors 320 and 321. The output of filter 312 is coupled over line 143 to the programmable
attenuator 313 shown in FIG. 5.

Referring now to FIG. 4, there is shown a detailed schematic of the programmable attenuator 213. This
attenuator comprises multiple sections which are formed by passing the output of an amplifier in one
section through a switchable voltage divider and then into the input of another amplifier. The signal on line
142 from the high pass filter 212 is coupled directly to a first section of the attenuator 213 comprising a
voltage divider consisting of resistors 222 and 223, a switch 224 and a follower amplifier 226. When the
switch 224 is closed, shorting resistor 222, the voltage developed across the voltage divider essentially will
be the original input voltage, all of which develops across resistor 223. Once the switch is opened, in
response to a command from the computer 110, the signal developed at the juncture of resistors 222 and
223 is reduced from that of the original input voltage level to the desired lower level. The loss is inserted in
each section of the attenuator in this manner.

Thus in operation, a speech signal passing through the first section of the attenuator is either passed at
the original voltage level or attenuated by 28 dB. If the switch is turned on, i.e., the resistor 222 shorted out,
then no loss is inserted. If the switch is turned off, then 28 dB of loss is inserted. The signal then goes
through a second similar section which has 14 dB of loss. This second section of the attenuator 213
comprises a voltage divider consisting of resistors 227 and 228, a switch 229 and a follower amplifier 230.
This second section is followed by a third section which has 7 dB of loss. This third section of the
attenuator 213 comprises a voltage divider consisting of resistors 231 and 232, a switch 233 and a follower
amplifier 234. A fourth and final section has 3 1/2 dB of loss. This final section of the attenuator 213
comprises resistors 235 and 236 and a switch 237. By selecting the proper combination of on/off values for
switches 224, 229, 233 and 237, the computer 110 may select from 0 to 56 dB of loss in 3 1/2 dB
increments. It should be understood that if a finer control of this attenuator is desired such that it could
select attenuation in 1.75 dB increments, it is but a simple matter for one skilled in the art, in view of the
above teachings, to add another section to the attenuator thereby providing this level of control.
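The switch selection for a requested loss can be sketched as a binary-weighted decomposition over the four sections. The function names are hypothetical, and the sketch follows the convention described above that a closed switch shorts its resistor and inserts no loss. Note that the four listed sections sum to 52.5 dB, so the sketch treats that as its upper limit.

```python
SECTION_LOSSES_DB = (28.0, 14.0, 7.0, 3.5)  # sections one through four
STEP_DB = 3.5


def switch_settings(loss_db):
    """Return four booleans; True means the switch is closed (no loss inserted)."""
    steps = round(loss_db / STEP_DB)
    max_steps = int(sum(SECTION_LOSSES_DB) / STEP_DB)
    if steps < 0 or steps > max_steps or abs(steps * STEP_DB - loss_db) > 1e-9:
        raise ValueError("loss must be a multiple of 3.5 dB that the sections can form")
    settings = []
    remaining = steps * STEP_DB
    for section_loss in SECTION_LOSSES_DB:
        if remaining >= section_loss:
            settings.append(False)  # switch open: this section inserts its loss
            remaining -= section_loss
        else:
            settings.append(True)   # switch closed: resistor shorted, no loss
    return tuple(settings)


def inserted_loss(settings):
    """Total loss in dB for a given set of switch positions."""
    return sum(loss for loss, closed in zip(SECTION_LOSSES_DB, settings) if not closed)
```

For example, `switch_settings(17.5)` opens the 14 dB and 3 1/2 dB sections and leaves the other two closed, and `inserted_loss` of that result gives back 17.5.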

The signal from the programmable attenuator 213 is coupled to the low pass filter 215 which provides
additional shaping to the transmit signal. Low pass filter 215 comprises a follower amplifier 238, and
associated circuitry comprising capacitors 239 and 240, and resistors 241 and 242. The output of filter 215
is coupled to a transmit audio output level conversion circuit, comprising amplifier 144, resistors 145, 146
and 147, and also capacitor 148, for connection to the audio output line 101. This output level conversion
circuit provides an output impedance of 600 ohms for matching to the output line 101.

Referring now to FIG. 5, there is shown a detailed schematic for the programmable attenuator 313, the low
pass filter 315 and the amplifier 114 for the loudspeaker 112. The same basic components are used in
implementing the programmable attenuator 313 and the programmable attenuator 213. Because of this and
the detailed description given to attenuator 213, this attenuator 313 will not be described in similar detail.

Follower amplifiers 326, 330 and 334 along with resistors 322, 323, 327, 328, 331, 332, 335 and 336, and
also switches 324, 329, 333 and 337 combine in forming the four sections of the attenuator 313. As in
attenuator 213, a speech signal is attenuated 28 dB by section one, 14 dB by section two, and 7 dB and
3 1/2 dB by sections three and four respectively.

The signal from the programmable attenuator 313 is coupled to the low pass filter 315 which provides
additional shaping to the receive signal. Low pass filter 315 comprises a follower amplifier 338, and
associated circuitry including capacitors 339 and 340, and resistors 341 and 342. In amplifier 114, an
amplifier unit 149 and associated circuitry, variable resistor 150, resistors 151 and 152, and capacitors 153
and 154, provide gain for the output signal from low pass filter 315 before coupling this signal to the
speaker 112 via a capacitor 155.

With reference to FIG. 6, there is shown a general speakerphone circuit 600 for describing the two types
of coupling, hybrid and acoustic, that most affect the operation of a speakerphone being employed in a
telephone connection. A hybrid 610 connects the transmit and receive paths of the speakerphone to a
telephone line whose impedance may vary depending upon, for example, its length from a central office, as
well as, for example, other hybrids in the connection. The hybrid provides only a best case
approximation to a perfect impedance match to this line. Thus a part of the signal on the transmit path to
the hybrid returns over the receive path as hybrid coupling. With this limitation and the inevitable acoustic
coupling between a loudspeaker 611 and a microphone 612, transmit and receive loss controls 613 and 614
are inserted in the appropriate paths to avoid degenerative feedback or singing.

In accordance with the invention, the computer controlled adaptive speakerphone 100 of FIG. 1
advantageously employs a process or program described herein with reference to a state diagram of FIG. 7
and flow diagrams of FIGS. 8, 9 and 10 for improved performance. This process dynamically adjusts the
operational parameters of the speakerphone for the best possible performance in view of existing hybrid
and acoustic coupling conditions.

Referring now to FIG. 7, there is shown the state diagram depicting the possible states of the
speakerphone 100. The speakerphone initializes in an idle state 701. While in this state, the speakerphone
has a symmetrical path for entering into either a transmit state 702 or a receive state 703, according to
which of these two has the stronger signal. If there is no transmit or receive speech while the speakerphone
is in the idle state 701, the speakerphone remains in this state as indicated by a loop out of and back into
this idle state. Generally, if speech is detected in the transmit or receive path, the speakerphone moves to
the corresponding transmit or receive state. If the speakerphone has moved to the transmit state 702, for
example, and transmit speech continues to be detected, the speakerphone then remains in this state. If the
speakerphone detects receive speech having a stronger signal than the transmit speech, a receive break-in
occurs and the speakerphone moves to the receive state 703. If transmit speech ceases and no receive
speech is present, the speakerphone returns to the idle state 701. Operation of the speakerphone in the
receive state 703 is essentially the reverse of its operation in the transmit state 702. Thus if there is receive
speech following the speakerphone moving to the receive state 703, the speakerphone stays in this state. If
transmit speech successfully interrupts, however, the speakerphone goes into the transmit state 702. And if
there is no receive speech while the speakerphone is in the receive state 703 and no transmit speech to
interrupt, the speakerphone returns to the idle state.

Referring next to FIG. 8, there is shown a flow chart illustrating in greater detail the operation of the
speakerphone 100 in determining whether to remain in the idle state or move from the idle state to the
transmit state or receive state. The process is entered at step 801 wherein the speakerphone is in the idle
state. From this step, the process advances to the decision 802 where it determines whether the detected
transmit signal is greater than the transmit noise by a certain threshold. If the detected transmit signal is
greater than the transmit noise by the desired amount, the process proceeds to decision 803. At this
decision, a determination is made as to whether the detected transmit signal exceeds the expected transmit
signal by a certain threshold.

The expected transmit signal is that component of the transmit signal that is due to the receive signal
coupling from the loudspeaker to the microphone. This signal will vary based on the receive speech signal,
the amount of switched loss, and the acoustics of the room as determined during the acoustic calibration
process. The expected transmit level is used to guard against false switching that can result from room
echoes; therefore, the transmit level must exceed the expected transmit level by a certain threshold in order
for the speakerphone to switch into the transmit state.

If the detected transmit signal does not exceed the expected transmit signal by the threshold, the
process advances to decision 806. If the detected transmit signal exceeds the expected transmit signal by
the threshold, however, the process advances to step 804 where a holdover timer is initialized prior to the
speakerphone entering the transmit state. Once activated, this timer keeps the speakerphone in either the
transmit state or the receive state over a period of time, approximately 1.2 seconds, when there is no
speech in the then selected state. This allows a suitable period for bridging the gap between syllables,
words and phrases that occur in normal speech. From step 804 the process advances to step 805 where
the speakerphone enters the transmit state.

Referring once again to step 802, if the detected transmit signal is not greater than the transmit noise
by a certain threshold, then the process advances to the decision 806. In this decision, and also in decision
807, the receive path is examined in the same manner as the transmit path in decisions 802 and 803. In
decision 806, the detected receive signal is examined to determine if it is greater than the receive noise
by a certain threshold. If the detected receive signal is not greater than the receive noise by this threshold,
the process returns to the step 801 and the speakerphone remains in the idle state. If the detected receive
signal is greater than the receive noise by the desired amount, the process proceeds to decision 807. At
this decision, a determination is made as to whether the detected receive signal exceeds the expected
receive signal by a certain threshold.

The expected receive signal represents the amount of speech seen on the receive line that is due to
transmit speech coupled through the hybrid. This signal is calculated on an ongoing basis by the
speakerphone and depends on the hybrid average, the amount of switched loss, and the transmit speech
signal. Since the transmit speech path is open to some extent while the speakerphone is in the idle state,
a certain amount of hybrid reflection occurs, which, in turn, causes a certain amount of the
speech signal detected on the receive path to be due to actual background noise or speech in the room.
This, in turn, is read as a certain expected level of receive speech. The actual receive speech signal
must surpass this expected level by the threshold in order for the speakerphone to determine with certainty
that there is actually a far-end party talking.

If the detected receive signal does not exceed the expected receive signal by the threshold, the
process returns to the step 801 and the speakerphone remains in the idle state. If the detected receive
signal exceeds the expected receive signal by the threshold, however, the process advances to step 808
where the holdover timer is initialized. From step 808 the process advances to step 809 where the
speakerphone is directed to enter the receive state.
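
The idle-state flow of FIG. 8 can be summarized in a minimal sketch. The names `State`, `idle_step`, `noise_margin` and `expected_margin` are hypothetical, as is the use of one pair of margins for both paths; the text says only that each comparison uses "a certain threshold". Signal levels are assumed to be on a logarithmic scale so that a threshold can be added to a level.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    TRANSMIT = "transmit"
    RECEIVE = "receive"


HOLDOVER_SECONDS = 1.2  # approximate holdover period given in the text


def idle_step(tx, tx_noise, tx_expected, rx, rx_noise, rx_expected,
              noise_margin, expected_margin):
    """One pass through the idle-state flow of FIG. 8.

    Returns (next_state, holdover), where holdover is the timer value to
    load on entering an active state, or None when remaining idle.
    """
    # Decisions 802 and 803: the transmit path is examined first.
    if tx > tx_noise + noise_margin and tx > tx_expected + expected_margin:
        return State.TRANSMIT, HOLDOVER_SECONDS   # steps 804 and 805
    # Decisions 806 and 807: the receive path is examined the same way.
    if rx > rx_noise + noise_margin and rx > rx_expected + expected_margin:
        return State.RECEIVE, HOLDOVER_SECONDS    # steps 808 and 809
    return State.IDLE, None                       # back to step 801
```

Failing either transmit test falls through to the receive tests, matching the two routes into decision 806 described above.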

Referring next to FIG. 9, there is shown a flow chart illustrating in greater detail the operation of the
speakerphone 100 in determining whether to remain in the transmit state or move from the transmit state to
either the receive state or idle state. The process is entered at step 901 wherein the speakerphone has
entered the transmit state. From this step, the process advances to the decision 902 where a determination
is made as to whether the detected receive signal exceeds the expected receive signal by a certain
threshold. If the detected receive signal does not exceed the expected receive signal by the threshold, the
process advances to decision 907. If the detected receive signal exceeds the expected receive signal by
the threshold, however, the process advances to step 903 where the detected receive signal is
examined to determine if it is greater than the receive noise by a certain threshold. If the detected receive
signal is not greater than the receive noise by this threshold, the process advances to decision 907. If the
detected receive signal is greater than the receive noise by the desired amount, the process proceeds to
decision 904.

At decision 904, a determination is made as to whether the detected receive signal is greater than the
detected transmit signal by a certain threshold. This decision is applicable when the near-end party and the
far-end party are both speaking and the far-end party is attempting to break in and change the state of the
speakerphone. If the detected receive signal is not greater than the detected transmit signal by the
threshold, the process proceeds to decision 907. If the detected receive signal is greater than the detected
transmit signal by the threshold, however, the process proceeds to step 905 where the holdover timer is
initialized for the receive state. From step 905, the process advances to step 906 where it causes the
speakerphone to enter the receive state.

At decision 907, the process checks to see if the detected transmit signal is greater than the transmit
noise by a certain threshold. If the detected transmit signal is greater than the transmit noise by the desired
amount, the holdover timer is reinitialized at step 908, the process returns to step 901 and the
speakerphone remains in the transmit state. Each time the holdover timer is reinitialized for a certain state, the
speakerphone will remain minimally in that state for the period of the holdover timer, 1.2 seconds.

If at decision 907, the process finds that the detected transmit signal is not greater than the transmit noise by
the threshold, i.e., no speech from the near-end party, the process advances to the decision 909
where it determines if the holdover timer has expired. If the holdover timer has not expired, the process
returns to step 901 and the speakerphone remains in the transmit state. If the holdover timer has expired,
the process advances to step 910 and the speakerphone returns to the idle state.
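
A minimal sketch of the transmit-state flow of FIG. 9 follows. The function name `transmit_step`, the single shared `margin`, and the convention that the caller counts down `holdover_left` between passes are assumptions for illustration only; the specification does not state how the timer is implemented or whether the thresholds are equal.

```python
def transmit_step(tx, tx_noise, rx, rx_noise, rx_expected,
                  holdover_left, margin, holdover=1.2):
    """One pass through the transmit-state flow of FIG. 9.

    Returns (next_state, holdover_left), where next_state is one of
    "transmit", "receive" or "idle". Levels are assumed to be on a
    logarithmic scale so that a margin can be added to a level.
    """
    # Decisions 902, 903 and 904 must all pass for a receive break-in.
    if (rx > rx_expected + margin
            and rx > rx_noise + margin
            and rx > tx + margin):
        return "receive", holdover          # steps 905 and 906
    # Decision 907: is near-end speech still present?
    if tx > tx_noise + margin:
        return "transmit", holdover         # step 908, then back to 901
    # Decision 909: has the holdover timer expired?
    if holdover_left > 0.0:
        return "transmit", holdover_left    # back to step 901
    return "idle", 0.0                      # step 910
```

Any failed break-in test drops to the near-end speech check, which mirrors the three separate routes into decision 907.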

Referring next to FIG. 10, there is shown a flow chart illustrating in greater detail the operation of the
speakerphone 100 in determining whether to remain in the receive state or move from the receive state to
either the transmit state or idle state. The process is entered at step 1001 wherein the speakerphone has
entered the receive state. From this step, the process advances to the decision 1002 where a determination
is made as to whether the detected transmit signal exceeds the expected transmit signal by a certain
threshold. If the detected transmit signal does not exceed the expected transmit signal by the threshold, the
process advances to decision 1007. If the detected transmit signal exceeds the expected transmit signal by
the threshold, however, the process proceeds to step 1003 where the detected transmit signal is
examined to determine if it is greater than the transmit noise by a certain threshold. If the detected transmit
signal is not greater than the transmit noise by this threshold, the process advances to decision 1007. If the
detected transmit signal is greater than the transmit noise by the desired amount, the process proceeds to
decision 1004.

At decision 1004, a determination is made as to whether the detected transmit signal is greater than the
detected receive signal by a certain threshold. This decision is applicable when the far-end party and the
near-end party are both speaking and the near-end party is attempting to break in and change the state of
the speakerphone. If the detected transmit signal is not greater than the detected receive signal by the
threshold, the process proceeds to decision 1007. If the detected transmit signal is greater than the
detected receive signal by the threshold, however, the process proceeds to step 1005 where the holdover
timer is initialized for the transmit state. From step 1005, the process advances to step 1006 where it
causes the speakerphone to enter the transmit state.

At decision 1007, the process checks to see if the detected receive signal is greater than the receive 
noise by a certain threshold. If the detected receive signal is greater than the receive noise by the desired 
amount, the holdover timer is reinitialized at step 1008, the process returns to step 1001 and the 
speakerphone remains in the receive state. 

If at decision 1007, the process finds that the detected receive signal is not greater than the receive noise by
the threshold, i.e., no speech from the far-end party, the process advances to the decision 1009 where
it determines if the holdover timer has expired. If the holdover timer has not expired, the process returns to
step 1001 and the speakerphone remains in the receive state. If the holdover timer has expired, the process
advances to step 1010 and the speakerphone returns to the idle state.

Referring now to FIG. 11, there are shown illustrative waveforms which provide an impulse and a
composite characterization of an acoustic environment obtained during the acoustic calibration process
performed by the speakerphone 100. A tone signal, generated between 300 Hz and 3.3 kHz in fifty equal
logarithmically spaced frequency steps, is applied to the loudspeaker 112 of the speakerphone and the
return echo for each tone is measured by the microphone 111 and analyzed by the computer 110. Samples of
the return echo for each tone signal generated are taken at 10 millisecond intervals for a total sampling
period of 120 milliseconds.
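
The logarithmic spacing of the calibration tones can be sketched as follows. This reads the text as fifty tones with 300 Hz as the first and 3.3 kHz as the last, so that adjacent tones differ by a constant ratio; that reading, and the function name `calibration_frequencies`, are assumptions rather than details given in the specification.

```python
def calibration_frequencies(f_start=300.0, f_stop=3300.0, count=50):
    """Return `count` frequencies in Hz, equally spaced on a log scale.

    Both endpoints are included, so there are count - 1 equal ratios
    between f_start and f_stop.
    """
    ratio = (f_stop / f_start) ** (1.0 / (count - 1))
    return [f_start * ratio ** k for k in range(count)]
```

Under this reading each tone is roughly 5 percent higher in frequency than the one before it.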

The sample impulse responses shown in FIG. 11 are for the four frequencies, 300 Hz, 400 Hz, 500 Hz
and 3.3 kHz. As illustrated in this figure, the 300 Hz response initially has a fairly high amplitude (A), but
the energy quickly dissipates after the tone stops. In the 400 Hz response, the amplitude (A) is initially lower;
however, the energy does not dissipate as rapidly as in the 300 Hz response. And the energy in the 500 Hz
response dissipates even more slowly than in the 300 Hz and the 400 Hz impulse responses.

A composite waveform is generated next to each 300 Hz, 400 Hz and 500 Hz impulse response. This
composite waveform represents an integrated response pattern of the impulse responses. The 300 Hz
impulse response and the 300 Hz composite response are identical since this is the first measured
response. The subsequent composite responses are modified based on the new information that comes in
with each new impulse response. If that new information shows any ten millisecond time interval with a
higher amplitude return than is then on the composite response for the corresponding time interval, the old
information is replaced by the new information. If the new information has a lower amplitude return than that
on the composite for that corresponding time interval, the old information is retained on the composite
response. The 3.3 kHz frequency tone is the last of the 50 tones to be generated. The composite response
after this tone represents, for each ten millisecond time interval, essentially the worst case acoustic coupling
that may be encountered by the speakerphone during operation, independent of frequency.
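
The composite update described above amounts to keeping, for each ten millisecond interval, the largest amplitude seen so far across all tones. A minimal sketch, assuming each impulse response is a list of amplitude samples of equal length (the names `update_composite` and `build_composite` are hypothetical):

```python
def update_composite(composite, response):
    """Merge one tone's sampled response into the running composite.

    For each time interval the larger of the old composite value and the
    new response value is kept. The first response becomes the composite.
    """
    if composite is None:
        return list(response)
    return [max(old, new) for old, new in zip(composite, response)]


def build_composite(responses):
    """Fold the responses for all tones, in order, into one composite."""
    composite = None
    for response in responses:
        composite = update_composite(composite, response)
    return composite
```

With three toy responses `[9, 4, 1]`, `[5, 6, 3]` and `[4, 5, 4]`, the composite after the first is `[9, 4, 1]`, and after all three it is `[9, 6, 4]`.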

This measure of the initial characterization of the room acoustic environment in which the speakerphone
operates is used in a number of ways. The composite response is used for setting a switchguard threshold
which ensures that receive speech coming out of the loudspeaker is not falsely detected as transmit
speech and returned to the far-end party.

The composite response is also used for determining the total amount of loop loss necessary for proper
operation of the speakerphone. The amount of receive speech signal that is returned through the
microphone from the loudspeaker is used as part of the equation, which also includes the amount of hybrid
return, the amount of loss inserted by the programmable attenuators and the gain setting of the volume
control, to determine the total amount of loop loss.

The composite response is further used in determining the expected transmit level. This expected
transmit level is obtained from a convolution of the composite impulse response with the receive speech
samples. The receive speech samples are available in real time for the immediately preceding 120
milliseconds with sample points at approximately 10 millisecond intervals. The values of the sample points
occurring at each 10 millisecond interval in the receive response are convolved with the values of the sample
points corresponding to the same 10 millisecond intervals in the composite response. In this convolution,
the sampled values of the received speech response are, on a sample point by sample point basis,
multiplied by the corresponding values of the sample points contained in the composite response. The
resulting products are then summed together to obtain a single numerical value which represents the
convolution of the immediately preceding 120 milliseconds of receive speech and 120 milliseconds of initial
room characterization. This numerical value represents the amount of receive speech energy that is still in
the room and will be detected by the microphone.
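
The multiply-and-sum operation described above can be sketched in a few lines. The function name `expected_transmit_level` is hypothetical, and the sketch assumes the caller has already aligned the two sequences so that `receive_history[k]` is the receive sample that corresponds to `composite[k]`; the specification does not spell out the ordering, so that alignment is an assumption here.

```python
def expected_transmit_level(composite, receive_history):
    """Sum of point-by-point products of two equal-length sample lists.

    `composite` holds the composite room response samples and
    `receive_history` the receive speech samples for the same intervals
    (for example, twelve samples covering 120 ms at 10 ms spacing).
    """
    if len(composite) != len(receive_history):
        raise ValueError("sample sequences must have the same length")
    return sum(c * r for c, r in zip(composite, receive_history))
```

For instance, a composite of `[0.5, 0.25, 0.0]` and receive samples of `[2.0, 4.0, 8.0]` give 1.0 + 1.0 + 0.0, a single value of 2.0.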

The following example illustrates how the convolution of the composite response with the received
speech provides for more efficient operation of the speakerphone. If, by way of example, the near-end party
begins talking and the speakerphone is in the receive state receiving speech from the far-end party, a
certain amount of the signal coming out of the loudspeaker is coupled back into the microphone. The
speakerphone has to determine whether the speech seen at the microphone is due solely to acoustic
coupling, or whether it is due to the near-end talker. This determination is essential in deciding which state
the speakerphone should be entering. To make this determination, the computer convolves the composite
impulse response of the room with the receive speech signal to determine the level of speech seen at the
microphone that is due to acoustic coupling. If the amount of signal at the microphone is greater than
expected, then the computer knows that the near-end user is trying to interrupt and can permit a break-in;
otherwise, the speakerphone will remain in the receive state.

When a speakerphone type device is operated in a near full or full duplex mode, the far-end party's
speech emanating from the loudspeaker is coupled back into the microphone and back through the
telephone line to the far-end. Because of the proximity of the loudspeaker to the microphone, the speech
level at the microphone resulting from speech at the loudspeaker is typically much greater than that
produced by the near-end party. The result is a loud and reverberant return echo to the far-end. To alleviate
this unpleasant side effect of near full or full duplex operation, an echo suppression process, which inserts
loss in the transmit path as appropriate, is employed.

A diagram generally illustrating the insertion of echo suppression loss during near full or full duplex
operation is shown in FIG. 12. The speech signal in the receive path is measured by a measuring system
1210. Such a measuring system, by way of example, is available from high pass filter 312, envelope
detector 314 and logarithmic amplifier 316 shown in FIG. 1. The output of measuring system 1210 is
passed through an acoustic coupling equation 1211 in order to include the effects of acoustic coupling on
the signal to be seen at the microphone. The acoustic coupling equation could be as simple as a fast
attack, slow decay analog circuit. In this implementation, the acoustic coupling equation is the composite
room impulse response that is generated during the acoustic calibration phase of the calibration process.
The output of the equation is the expected transmit signal level described earlier herein. The resulting
signal is then used to provide a control signal for the modulation of the transmit path loss. An echo
threshold detection circuit 1212 monitors the amplitude of the control signal from the acoustic coupling
equation 1211. When the control signal exceeds a predetermined threshold (below which the return echo
would not be objectionable to the far-end party), transmit loss which tracks the receive speech is inserted
into the transmit path by the modulation circuit 1213.

By monitoring the transmit and receive speech signals, the process determines when the speech signal
into the microphone is a result of acoustically coupled speech from the loudspeaker. While the
speakerphone is operating, the expected transmit signal level is also constantly monitored. This level is a direct
indication of loudspeaker to microphone coupling and loop switched loss. This expected transmit level will
tend to get larger as the speakerphone approaches full duplex operation. When this signal exceeds an echo
threshold (below which the return echo would not be objectionable to the far-end party), additional loss is
inserted into the transmit path. This echo suppression loss, when needed, tracks the receive speech
envelope at a syllabic rate after a 1 to 5 millisecond delay.

Referring next to FIG. 13, there is shown a flow diagram illustrating the decision making process for the
application of echo suppression loss. The process is entered at decision 1301 where the transmit signal
level is compared with the expected transmit signal level plus a coupling threshold. If the expected transmit
signal level plus the coupling threshold is less than the measured transmit signal, the process advances to
step 1302 since receive speech is not present and echo suppression is therefore not necessary. If the
expected transmit signal level plus the coupling threshold is greater than the measured transmit signal, the
process advances to decision 1303 since the speakerphone is emanating speech from the loudspeaker that
may need to be suppressed.

At decision 1303, a determination is made as to whether the loop switched loss is great enough to
obviate the need for additional echo suppression loss. If loop switched loss is greater than the coupling
threshold, the process advances to step 1304 since the switched loss will prevent objectionable echo to the
far-end and echo suppression is not necessary. If loop switched loss is not great enough to provide
sufficient echo reduction, however, the process advances to decision 1305.

At decision 1305, a determination is made as to whether the expected level of the transmit signal is
greater than the loop switched loss plus an echo threshold. If so, the process advances to step 1306 since
the return echo would not be objectionable to the far-end party and echo suppression is not necessary. If,
however, the expected level of the transmit signal is less than the loop switched loss plus an echo
threshold, echo suppression is necessary and the process advances to step 1307. The echo suppression is
then inserted into the transmit path at step 1307 as follows: loss = expected transmit level - (loop switched
loss - echo threshold).

Shown in FIG. 14 is a waveform illustrating how, in speakerphone 100, loss is inserted into the transmit
path via programmable attenuator 213 in accordance with the echo suppression process.

Although a specific embodiment of the invention has been shown and described, it will be understood
that it is but illustrative and that various modifications may be made therein without departing from the spirit
and scope of the invention as defined in the appended claims.



Claims

1. A voice switching apparatus for processing speech signals on a communication line including means
for switching between a receive state for receiving speech signals from the communication line and a
transmit state for transmitting speech signals over the communication line, CHARACTERIZED IN INCLUDING

an acoustic calibration circuit (313) for determining the type of acoustic environment in which the voice
switching apparatus is employed, the calibration circuit comprising:

means (110, 113, 310, 112) for generating a tone burst signal in said environment; and

measuring means (110, 111, 200) responsive to the return of the tone burst signal to said apparatus for
measuring the resulting time-domain acoustic response of said environment;

calibration means (110) operably responsive to the measuring means for adjusting threshold switching
levels at which the apparatus switches between the receive state and the transmit state.

2. The acoustic calibration circuit as in claim 1 FURTHER CHARACTERIZED IN THAT the tone burst
signal comprises multiple frequency signals generated separately at different time intervals and for a
common fixed time period, and the time-domain acoustic response comprises a composite representation
of each one of the multiple frequency signals having the largest amplitude measured at each one of
multiple predetermined time intervals for providing the amplitude of the acoustic response and the duration
of echoes from the tone burst signal.

3. The acoustic calibration circuit as in claim 2 FURTHER CHARACTERIZED IN THAT the measuring
means further comprises comparison means for periodically comparing the time-domain acoustic response
with received speech signals having a comparable time period, the calibration means responsive to the
comparison means adjusting the threshold switching levels for switching between the receive state and the
transmit state.

4. A voice switching apparatus as in claim 1 including variable switched loss means for alternately
inserting loss in a receive path for attenuating the speech signals received from the communication line and
in a transmit path for attenuating the speech signals for transmission over the communication line,
FURTHER CHARACTERIZED IN THAT:

the calibration means operably responsive to the measuring means adjusting the level of attenuation
inserted by the variable switched loss means into the transmit path and the receive path.

5. The acoustic calibration circuit as in claim 4 FURTHER CHARACTERIZED IN THAT the measuring
means further comprises comparison means for periodically comparing the time-domain acoustic response
with received speech signals having a comparable time period, the calibration means responsive to the
comparison means adjusting the level of attenuation inserted by the variable switched loss means in the
receive path and the transmit path.

6. A method of determining the type of acoustic environment in which a voice signal controller is
employed, the voice signal controller being connectable to a communication line and switching between a
receive state for receiving speech signals from the communication line and a transmit state for transmitting
speech signals over the communication line, CHARACTERIZED IN THAT the method comprises the steps
of:

generating a tone burst signal in said environment;

measuring the return of the tone burst signal to said controller for generating a time-domain acoustic
response representative of the acoustic environment; and

adjusting threshold switching levels at which the controller switches between the receive state and the
transmit state responsive to the measuring step.

7. The method of determining the type of acoustic environment as in claim 6 FURTHER CHARACTERIZED
IN THAT the tone burst signal comprises multiple frequency signals generated separately at different
time intervals and for a common fixed time period, and the time-domain acoustic response comprises a
composite representation of each one of the multiple frequency signals having the largest amplitude
measured at each one of multiple predetermined time intervals for providing the amplitude of the acoustic
response and the duration of echoes from the tone burst signal.

8. The method of determining the type of acoustic environment as in claim 7 FURTHER CHARACTERIZED
IN THAT the measuring step further includes the step of periodically comparing the time-domain
acoustic response with received speech signals having a comparable time period, the threshold switching
levels adjusting step, operably responsive to the comparison step, adjusting the threshold switching levels
for switching between the receive state and the transmit state.

9. The method of determining the type of acoustic environment as in claim 6 FURTHER CHARACTERIZED
IN the steps of:

inserting loss alternately in a receive path for attenuating the speech signals received from the communication
line and in a transmit path for attenuating the speech signals for transmission over the communication
line; and

adjusting the level of attenuation inserted by the loss insertion step in response to the measuring step.

10. The method of determining the type of acoustic environment as in claim 9 FURTHER CHARACTERIZED
IN THAT the measuring step further includes the step of periodically comparing the time-domain
acoustic response with received speech signals having a comparable time period, the loss insertion step,
operably responsive to the comparison step, adjusting the level of attenuation inserted in the receive path
and the transmit path.